Exploring Rated Datasets with Rating Maps
Authors
Abstract
Online rated datasets have become a source for large-scale population studies for analysts and a means for end-users to achieve routine tasks such as finding a book club. Existing systems, however, provide only limited insights into the opinions of different segments of the rater population. In this paper, we develop a framework for finding and exploring population segments and their opinions. We propose rating maps, a collection of (population segment, rating distribution) pairs, where a segment, e.g., 〈18-29 year old males in CA〉, has a rating distribution in the form of a histogram that aggregates its ratings for a set of items (e.g., movies starring Russell Crowe). We formalize the problem of building rating maps dynamically given desired input distributions. Our problem raises two challenges: (i) the choice of an appropriate measure for comparing rating distributions, and (ii) the design of efficient algorithms to find segments. We show that the Earth Mover’s Distance (EMD) is well-adapted to comparing rating distributions and prove that finding segments whose rating distribution is close to the input ones is NP-complete. We propose an efficient algorithm for building Partition Decision Trees and heuristics for combining the resulting partitions to further improve their quality. Our experiments on real and synthetic datasets validate the utility of rating maps for both analysts and end-users.
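To make the comparison of rating distributions concrete, here is a minimal sketch of the Earth Mover's Distance between two rating histograms. It is an illustration under stated assumptions, not the paper's implementation: ratings are assumed to lie on an ordinal scale (for example 1 to 5 stars) with unit distance between adjacent ratings, and both histograms are normalized to sum to 1. Under those assumptions the one-dimensional EMD reduces to the sum of absolute differences between the two cumulative distributions. The function names `normalize` and `emd_1d` and the sample counts are hypothetical.

```python
from itertools import accumulate


def normalize(hist):
    """Turn raw rating counts into a probability distribution."""
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must contain at least one rating")
    return [count / total for count in hist]


def emd_1d(hist_a, hist_b):
    """EMD between two histograms over the same ordinal rating scale.

    Assumes unit ground distance between adjacent rating values, so the
    distance equals the sum of absolute gaps between the two CDFs.
    """
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must cover the same rating scale")
    p, q = normalize(hist_a), normalize(hist_b)
    # Running sum of (p - q) is the gap between the cumulative distributions.
    cdf_gap = accumulate(pa - qa for pa, qa in zip(p, q))
    return sum(abs(gap) for gap in cdf_gap)


# Hypothetical counts of 1..5 star ratings for one segment and one target.
segment_ratings = [2, 3, 10, 40, 45]
target_distribution = [0, 0, 0, 50, 50]
distance = emd_1d(segment_ratings, target_distribution)
```

Unlike a bin-by-bin measure, this distance grows with how far rating mass has to move along the scale, so a segment that rates mostly 4 is closer to an all-5 target than a segment that rates mostly 1.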
Similar resources
Finding and Exploring Rating Distributions (Technical Report)
Online rated datasets have become a source for large-scale population studies for analysts and a means for end-users to achieve routine tasks such as finding a book club. Existing systems, however, provide only limited insights into the opinions of different segments of the rater population. In this technical report, we assume that a segment, e.g., 〈18-29 year old males in CA〉, has a rating distri...
A novel approach to exploring company's financial soundness: Investor's perspective
The main objectives of this article are the prediction of a company’s life cycle stage change; the creation of an ordered 2D map that allows a company’s financial soundness to be explored from a rating agency perspective; and the prediction of trends in the main valuation attributes usually used by investors. The developed algorithms are based on a random forest (RF) and a nonlinear data mapping technique “t-distributed...
Co-Regression for Cross-Language Review Rating Prediction
The task of review rating prediction can be well addressed by using regression algorithms if there is a reliable training set of reviews with human ratings. In this paper, we aim to investigate a more challenging task of cross-language review rating prediction, which makes use of only rated reviews in a source language (e.g., English) to predict the rating scores of unrated reviews in a target la...
Faculty Attitudes Towards Student Ratings: Do the Student Rating Scores Really Matter?
Abdolhussein Shakurnia. Abstract. Introduction: A survey of faculty attitudes towards student ratings can reveal the strengths and weaknesses of faculty evaluation and be considered an effective measure leading to higher quality. The purpose of this study was to investigate the effect of faculty evalua...
Fast and accurate map merging for multi-robot systems
We present a new algorithm for merging occupancy grid maps produced by multiple robots exploring the same environment. The algorithm produces a set of possible transformations needed to merge two maps, i.e., translations and rotations. Each transformation is weighted, which makes it possible to distinguish uncertain situations and to track multiple cases when ambiguities arise. Transformations ar...